Introduction#

This lesson covers the optimizations and tradeoffs we make to meet the proposed non-functional requirements. We also discuss response time, a key factor in our service's efficiency, and examine how the design holds up when dealing with global player interactions.

Non-functional requirements#

The sections below describe how we achieve the non-functional requirements of the game API.

Availability#

By separating gameplay and CRUD operations on game assets, we ensure service decoupling and improve service availability. Our solution ensures that game configurations are on healthy servers through regular checks and backups. In the unlikely event of a server failure, we can quickly rebuild the session from the most recent backup. Furthermore, we can cap the maximum number of players who can join a game to manage resources efficiently.

Scalability#

Regionally distributed clusters make scaling services easier. This approach is scalable and cost-effective because the cluster controller manages multiple game servers, making it straightforward to add and remove servers. Asynchronous communication between the cluster controller and game services enables efficient resource management and load balancing during periods of increased traffic.

Security#

We implement authentication and authorization using a login mechanism. Joining a game lobby requires a JWT access token, which is only issued when there is enough space for the players or teams willing to join the game. The JWT also identifies the user and the user's privileges in the game. When a game ends, the game server shares game stat updates with the game service via a private cloud to avoid any data manipulation. We also have a patch controller for security and version updates, ensuring that every client joining the game has installed the service's critical updates. To prevent in-game hacking and cheating, each client periodically synchronizes its state with the server. Furthermore, our API validates state changes on the server before merging and broadcasting them to other players.
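As an illustration of server-side validation, the sketch below shows one plausible check—rejecting a reported move that covers an impossible distance—before merging it into the authoritative state. All names and limits here are hypothetical, not part of the actual API:

```python
# Hypothetical sketch of server-side state validation: the server rejects a
# client-reported move that covers an impossible distance before merging it
# into the authoritative game state. Names and limits are illustrative.
from dataclasses import dataclass

MAX_SPEED = 10.0  # assumed maximum movement per tick, in game units


@dataclass
class PlayerState:
    x: float
    y: float


def validate_move(current: PlayerState, proposed: PlayerState,
                  ticks_elapsed: int) -> bool:
    """Reject moves that exceed the physically possible distance."""
    dist = ((proposed.x - current.x) ** 2 + (proposed.y - current.y) ** 2) ** 0.5
    return dist <= MAX_SPEED * ticks_elapsed


def apply_move(state: dict, player_id: str, proposed: PlayerState,
               ticks_elapsed: int = 1) -> bool:
    """Merge the move only if it validates; the caller then broadcasts it."""
    if validate_move(state[player_id], proposed, ticks_elapsed):
        state[player_id] = proposed
        return True
    return False  # drop (and possibly flag) the suspicious update
```

A real implementation would validate many more invariants (cooldowns, ammunition, line of sight), but the shape is the same: the server, not the client, is the authority on the game state.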

Low latency#

We achieve low latency by sending as little data (state changes) as possible over the wire. For global tournaments, we replicate contestant data (weapon appearance, avatars, and other defaults) across regionally distributed clusters so we don’t have to send this information across regions during gameplay. We implement lag compensation and client-side move prediction to reduce user-perceived latency and improve the gameplay experience. We also buffer and prefetch a small amount of data before showing it to other players. Additionally, data is transferred over a dedicated VPC channel with preallocated bandwidth, specifically configured for high-priority requests to keep latency as low as possible.
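The "send as little data as possible" idea can be sketched as simple delta encoding: transmit only the fields that changed since the last acknowledged state rather than a full snapshot. The field names below are hypothetical:

```python
# Illustrative delta encoding (field names are hypothetical): transmit only
# the fields that changed since the last acknowledged state, rather than a
# full snapshot, to keep per-event payloads small.
def diff_state(previous: dict, current: dict) -> dict:
    """Return only the keys whose values changed."""
    return {k: v for k, v in current.items() if previous.get(k) != v}


def apply_delta(state: dict, delta: dict) -> dict:
    """Reconstruct the full state on the receiving side."""
    merged = dict(state)
    merged.update(delta)
    return merged
```

If only a player's position changes between ticks, only that field crosses the wire; the receiver merges it back into its local copy of the state.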

Achieving non-functional requirements:

| Non-Functional Requirement | Approaches |
| --- | --- |
| Availability | Separation of gameplay from CRUD operations; capping the number of players per room; server health checks for early fault detection |
| Scalability | Regionally distributed clusters; transmitting data through CDNs; asynchronous back-end communication |
| Security | Login mechanism for authentication and authorization; JWT access token for joining a room; game statistics updated only by the game server; server-side game state validation; patch controller for security and version updates; single sign-on support for social media accounts |
| Low latency | Routing only state changes to the game server; replicating regional players' data; lag compensation and client-side prediction; buffering and prefetching data before display; using a VPC for data transfer; choosing a data format and architecture style suited to the requirement |

Latency budget#

Let's estimate the latency of the following three main operations performed by our game API.

Note: As discussed in the back-of-the-envelope calculations for latency, for POST requests, the RTT increases with the data size by 1.15 ms per KB beyond the base RTT—the minimum RTT taken by a request with the smallest data size—which was 260 ms. Moreover, the time to download the response to a request varies by 0.4 ms per KB.

Initiate game session#

Clients send POST requests to initiate a game session. Let's estimate the request and response sizes to calculate the response time for this request.

Request and response size#

We assume that a payload of 11 KB is attached to the request, containing client-device information such as the installed game version, playerId, gameMode, playerDefaults, and other game settings. The response returned by the server is estimated to be roughly 7 KB and contains the server's joinURL, the server's exposed port, matchId, roomId, the hash identifier assigned to the player, and so on. Let's use these sizes to calculate the response time in the next section.

Request\ size = 11\ KB
Response\ size = 7\ KB

Response time#

Let's use the calculator below to estimate the response time for initiating the game session on a game server.

Response time calculator for the initiation of a game session:

| Parameter | Value |
| --- | --- |
| Request size | 11 KB |
| Response size | 7 KB |
| Minimum latency | 395.95 ms |
| Maximum latency | 476.95 ms |
| Minimum response time | 399.95 ms |
| Maximum response time | 480.95 ms |

The calculations above are made assuming that we have a POST request of size 11 KB and a response of size 7 KB.

Latency_{min} = Time_{base\_min} + RTT_{post} + Time_{download}

Latency_{min} = 120.5 + (260 + 1.15 \times 11) + (0.4 \times 7)

Latency_{min} \approx 395.95\ ms

Similarly:

Latency_{max} = Time_{base\_max} + RTT_{post} + Time_{download}

Latency_{max} = 201.5 + (260 + 1.15 \times 11) + (0.4 \times 7) \approx 476.95\ ms

To calculate the response time:

Response_{min} = Latency_{min} + Processing_{min}

Response_{min} = 395.95\ ms + 4\ ms \approx 399.95\ ms

Similarly:

Response_{max} = Latency_{max} + Processing_{max}

Response_{max} = 476.95\ ms + 4\ ms \approx 480.95\ ms
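The derivation above can be checked with a short script that encodes the stated cost model; this is a sketch of the lesson's arithmetic, not a network measurement:

```python
# Sketch of the latency-budget arithmetic above, not a network measurement.
# Cost model from the lesson: POST RTT = 260 ms base + 1.15 ms per KB of
# request payload; download = 0.4 ms per KB of response; base times of
# 120.5 ms (min) and 201.5 ms (max); 4 ms of server processing.
BASE_MIN_MS, BASE_MAX_MS = 120.5, 201.5
BASE_RTT_POST_MS = 260.0
MS_PER_KB_UPLOAD = 1.15
MS_PER_KB_DOWNLOAD = 0.4
PROCESSING_MS = 4.0


def post_response_time(request_kb: float, response_kb: float) -> tuple[float, float]:
    """Return (min, max) response time in ms for a POST of the given sizes."""
    rtt_post = BASE_RTT_POST_MS + MS_PER_KB_UPLOAD * request_kb
    download = MS_PER_KB_DOWNLOAD * response_kb
    lat_min = BASE_MIN_MS + rtt_post + download
    lat_max = BASE_MAX_MS + rtt_post + download
    return lat_min + PROCESSING_MS, lat_max + PROCESSING_MS
```

For the 11 KB request and 7 KB response above, this yields roughly 399.95 ms and 480.95 ms, matching the hand calculation.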

Note: Although these numbers may seem high for a real-time application like a game, these requests are made before the game starts, where latency is not critical. Once the actual game starts, keeping latency to a minimum is important for a smooth gaming experience.

Start playing#

In order to join a gaming room, clients send GET requests. Let's see how the operation is carried out and how long it takes to finish.

Request and response size#

We know from the previous lesson that the actual gameplay is performed using the WebSocket protocol. Since the upgrade to WebSocket doesn’t require a body in the request and response messages, we assume the messages to be 1 KB in size, including header fields such as host, authentication, status code, and so on.

Response time#

For an HTTP GET request with a processing time of 4 ms, a base time of 201.5 ms, a round-trip time of 70 ms, and the download of the roughly 1 KB response (0.4 ms per KB), we get a protocol upgrade response time of 275.9 ms, as shown below:

Response time for protocol upgrade request
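As a quick check, the 275.9 ms figure can be decomposed as follows, assuming the 0.4 ms-per-KB download rate from the earlier note applies to the ~1 KB upgrade response:

```python
# Sketch of the arithmetic behind the 275.9 ms figure, assuming the 0.4 ms
# per KB download rate from the earlier note applies to the ~1 KB response.
BASE_MS = 201.5        # base time
RTT_MS = 70.0          # round-trip time
RESPONSE_KB = 1.0      # upgrade response size
PROCESSING_MS = 4.0    # server processing time

upgrade_response_time = BASE_MS + RTT_MS + 0.4 * RESPONSE_KB + PROCESSING_MS
```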

Send events#

Let's calculate the latency between clients and the game server when sending and receiving events.

Message size#

We use a binary format to represent event information, such as moves or actions performed by the player. Therefore, the data sent would be in the range of a few hundred bytes (say, 300 bytes).

Message\ size_{outgoing} = 300\ bytes \approx 0.29297\ KB

The size of the incoming message depends on the changes made by different players in that game to the global game state maintained on the server. Let's roughly calculate the overall state change by taking the maximum number of players allowed in the same room (100) and assuming they all make a similar change of 300 bytes.

Message\ size_{incoming} = 300\ bytes \times 100 = 30000\ bytes \approx 29.297\ KB

Response time#

We may disregard factors like base time and request compilation time because the connection has already been established and upgraded to WebSocket. Therefore, the following formula calculates the latency for transmitting event messages:

Latency = Message\ size \times 0.4 + Base\ propagation\ delay

Latency_{outgoing} = 0.29297 \times 0.4 + 35 \approx 35.12\ ms

Similarly:

Latency_{incoming} = 29.297 \times 0.4 + 35 \approx 46.72\ ms
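A small helper that applies the stated formula (size in KB × 0.4 ms, plus the 35 ms base propagation delay, converting sizes at 1024 bytes per KB):

```python
# A small helper applying the stated formula: latency = size_kb * 0.4 ms
# + 35 ms base propagation delay (sizes converted at 1024 bytes per KB).
BASE_PROPAGATION_MS = 35.0
MS_PER_KB = 0.4


def event_latency_ms(message_bytes: int) -> float:
    size_kb = message_bytes / 1024  # 300 bytes ≈ 0.29297 KB
    return size_kb * MS_PER_KB + BASE_PROPAGATION_MS
```

For a 300-byte outgoing event this gives about 35.12 ms; for the 30,000-byte aggregated incoming state it gives about 46.72 ms.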

Let's use the following calculator to add the processing time taken by the game server to sync up all the states:

Response time calculator for sending and receiving game state:

| Parameter | Value |
| --- | --- |
| Message size | 0.3 KB |
| Total number of players | 100 |
| Estimated outgoing event latency | 35.12 ms |
| Estimated incoming event latency | 46.72 ms |
| User-perceived response time | 85.84 ms |

The following illustration summarizes the overall response time required to exchange game events:

Response time for sending and receiving game events

Note: A latency of 50–100 ms is considered adequate for gaming. The response time calculated above is an average across servers located in different parts of the world and falls within this range.

Optimization and tradeoffs#

There is a tight coupling between the client and the server that validates the game state during a session. For a set of clients belonging to the same region, providing a smooth experience is feasible. However, as we scale, it becomes a challenge to provide the same level of experience to clients with varying configurations, networks, and devices. This calls for the optimization techniques listed below:

  • Dedicating resources: We can allocate resources for special events to ensure low ping and consistent response times. For example, a dedicated portion of the VPC bandwidth is provided for a global tournament while regular game mode can utilize the remaining bandwidth until the tournament ends. This is a tradeoff that highly depends on the business strategy of the organization.

  • UDP-based communication: TCP provides reliable communication, ensuring no data is lost during transmission. However, it can introduce high latency when packets are lost or arrive out of order. In such cases, we can use UDP-based communication to reduce latency where small data losses can be tolerated. We can even opt for protocols like HTTP/3 or reliable UDP (RUDP), which are built on UDP but provide reliable delivery. Their use is common in the gaming industry, even though these protocols lack features such as prefetching, buffering, and other performance-enhancing techniques. So it is a tradeoff between latency and the added complexity of making UDP reliable.

Note: Applications may also add ordering and selective reliability to UDP on the application layer.
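As a concrete illustration of the note above, here is a minimal, hypothetical sketch of application-layer reliability over UDP: sequence numbers provide ordering, and packets stay buffered at the sender until acknowledged. Real network I/O is replaced by direct method calls so the logic stays self-contained; all names are illustrative.

```python
# Hypothetical sketch of application-layer reliability on top of UDP:
# sequence numbers provide ordering, and packets stay buffered at the
# sender until acknowledged. Network I/O is replaced by direct method
# calls so the logic is self-contained; all names are illustrative.

class Sender:
    def __init__(self) -> None:
        self.next_seq = 0
        self.unacked: dict[int, bytes] = {}  # seq -> payload awaiting ack

    def send(self, payload: bytes) -> tuple[int, bytes]:
        packet = (self.next_seq, payload)
        self.unacked[self.next_seq] = payload
        self.next_seq += 1
        return packet  # would be written to a UDP socket here

    def on_ack(self, seq: int) -> None:
        self.unacked.pop(seq, None)

    def retransmit(self) -> list[tuple[int, bytes]]:
        """Packets still unacknowledged after a timeout are resent."""
        return sorted(self.unacked.items())


class Receiver:
    def __init__(self) -> None:
        self.expected = 0
        self.buffer: dict[int, bytes] = {}  # out-of-order packets held back
        self.delivered: list[bytes] = []

    def on_packet(self, seq: int, payload: bytes) -> int:
        self.buffer.setdefault(seq, payload)
        while self.expected in self.buffer:  # deliver the contiguous prefix
            self.delivered.append(self.buffer.pop(self.expected))
            self.expected += 1
        return seq  # ack the sequence number just received
```

If, say, the second of three packets is lost, the receiver delivers only the first, the sender retransmits the unacknowledged packet, and in-order delivery then resumes—essentially what RUDP-style protocols automate, with timers and congestion control on top.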

Summary#

In this chapter, we learned how to design the API of a gaming system by first highlighting its requirements. Based on these requirements, we designed the end-to-end communication, underlining the important decisions. Next, we focused on the endpoints unique to the gaming API. Finally, we saw how our API meets the non-functional requirements, such as defending against client-side cheats and hacks and supporting real-time communication.
